Counting Triangles in Sub-linear Time
Abstract
We consider the problem of estimating the number of triangles in a graph. This problem has been extensively studied in two models: exact counting algorithms, which require reading the entire graph, and streaming algorithms, where the edges are given in a stream and the memory is limited. In this work we design a sublinear-time algorithm for approximating the number of triangles in a graph, where the algorithm is given query access to the graph. The allowed queries are degree queries, vertex-pair queries and neighbor queries. The problem of counting triangles in sublinear time was previously considered in the work of Gonen et al. (SIAM Journal on Discrete Mathematics, 2011), whose main focus was on counting the number of stars (of a given size) in sublinear time. They gave a linear lower bound on the running time of any algorithm for approximating the number of triangles in a graph when the algorithm is allowed degree and neighbor queries (and when the number of edges and number of triangles are linear in the number of vertices). Following this lower bound they ask if sublinear-time algorithms can be designed when the algorithm is also allowed vertex-pair queries. We answer this question affirmatively by presenting an algorithm that, for any given approximation parameter $0 < \epsilon < 1$, provides an estimation $\hat{\Delta}$ such that with high constant probability, $(1 - \epsilon)\Delta(G) < \hat{\Delta} < (1 + \epsilon)\Delta(G)$, where $\Delta(G)$ is the number of triangles in the graph $G$. The query complexity of the algorithm is $O\left(\frac{n}{\Delta(G)^{1/3}} + \min\left\{m, \frac{m^{3/2}}{\Delta(G)}\right\}\right) \cdot \mathrm{poly}(\log n, 1/\epsilon)$, where $n$ is the number of vertices in the graph and $m$ is the number of edges. For values of $\Delta$ and $m$ such that $\Delta \geq \sqrt{m}$ the running time is of the same order as the query complexity. We also prove lower bounds that show this algorithm is tight up to polylogarithmic factors in $n$ and the dependence on $1/\epsilon$.
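The query model described in the abstract can be illustrated with a small sketch. The Python code below is not the paper's algorithm: `GraphOracle`, `exact_triangles` and `query_bound` are hypothetical names introduced here for illustration only. The oracle exposes the three allowed query types (degree, neighbor, vertex-pair) and counts how many are issued; `exact_triangles` is a naive, non-sublinear baseline that uses only those queries; `query_bound` evaluates the leading term of the stated query complexity, assuming the polylogarithmic and 1/ε factors are dropped.

```python
from itertools import combinations


class GraphOracle:
    """Hypothetical query-access wrapper around an undirected simple graph.

    Every degree, neighbor, or vertex-pair query increments `queries`.
    """

    def __init__(self, n, edges):
        self.n = n
        self._adj = [[] for _ in range(n)]
        self._edge_set = set()
        for u, v in edges:
            key = frozenset((u, v))
            if u == v or key in self._edge_set:
                continue  # ignore self-loops and duplicate edges
            self._edge_set.add(key)
            self._adj[u].append(v)
            self._adj[v].append(u)
        self.queries = 0

    def degree(self, v):
        """Degree query: number of neighbors of v."""
        self.queries += 1
        return len(self._adj[v])

    def neighbor(self, v, i):
        """Neighbor query: the i-th neighbor of v (0-indexed), or None."""
        self.queries += 1
        return self._adj[v][i] if i < len(self._adj[v]) else None

    def pair(self, u, v):
        """Vertex-pair query: is {u, v} an edge?"""
        self.queries += 1
        return frozenset((u, v)) in self._edge_set


def exact_triangles(oracle):
    """Naive exact baseline (NOT sublinear, NOT the paper's estimator).

    For each vertex, check every pair of its neighbors for adjacency.
    Each triangle is found once at each of its three vertices.
    """
    count = 0
    for v in range(oracle.n):
        d = oracle.degree(v)
        nbrs = [oracle.neighbor(v, i) for i in range(d)]
        for a, b in combinations(nbrs, 2):
            if oracle.pair(a, b):
                count += 1
    return count // 3


def query_bound(n, m, triangles):
    """Leading term n / t^(1/3) + min(m, m^(3/2) / t), for t > 0.

    Assumption: the poly(log n, 1/eps) factor is ignored.
    """
    return n / triangles ** (1 / 3) + min(m, m ** 1.5 / triangles)
```

On the complete graph with 4 vertices, the baseline finds all 4 triangles but already spends 28 queries; comparing `oracle.queries` against `query_bound(n, m, t)` on larger inputs gives a rough feel for the gap a sublinear algorithm aims to close.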
Similar resources
Efficient Algorithms for Approximate Triangle Counting
Counting the number of triangles in a graph has many important applications in network analysis. Several frequently computed metrics like the clustering coefficient and the transitivity ratio need to count the number of triangles in the network. Furthermore, triangles are one of the most important graph classes considered in network mining. In this paper, we present a new randomized algorithm f...
Fast counting of triangles in real-world networks: proofs, algorithms and observations
How can we quickly find the number of triangles in a large graph, without actually counting them? Triangles are important for real world social networks, lying at the heart of the clustering coefficient and of the transitivity ratio. However, straight-forward and even approximate counting algorithms can be slow, trying to execute or approximate the equivalent of a 3-way database join. In this pa...
Fast Counting of Triangles in Large Real Networks: Algorithms and Laws
How can we quickly find the number of triangles in a large graph, without actually counting them? Triangles are important for real world social networks, lying at the heart of the clustering coefficient and of the transitivity ratio. However, straight-forward and even approximate counting algorithms can be slow, trying to execute or approximate the equivalent of a 3-way database join. In this p...
MapReduce vs. Pipelining Counting Triangles
In this paper we follow an alternative approach, named pipeline, to build a parallel implementation of the well-known problem of counting triangles in a graph. This problem is especially interesting when the input graph either does not fit in memory or is dynamically generated. To be concrete, we implement a dynamic pipeline of processes and an ad-hoc MapReduce version using the language Go....
Counting and Sampling Triangles from a Graph Stream
This paper presents a new space-efficient algorithm for counting and sampling triangles—and more generally, constant-sized cliques—in a massive graph whose edges arrive as a stream. Compared to prior work, our algorithm yields significant improvements in the space and time complexity for these fundamental problems. Our algorithm is simple to implement and has very good practical performance on ...